Abstract

This paper presents an analytical framework to model fault-tolerance
in unstructured peer-to-peer overlays, represented as complex networks.
We define a distributed protocol that peers execute to manage the overlay
and react to node faults. Based on the protocol, evolution equations are
defined and manipulated by resorting to generating functions. The obtained
outcomes provide insights into the nodes' degree probability distribution.
From the study of the degree distribution, it is possible to estimate other
important metrics of the peer-to-peer overlay, such as the diameter of
the network. We study different networks, characterized by three specific
desired degree distributions, i.e. nets with nodes having a fixed desired
degree, random graphs and scale-free networks. All these networks are
assessed via the analytical tool and via simulation. Results show that
the approach can be employed in practice to dynamically tune the average
attachment rate at peers so that they maintain their own desired degree
and, in general, the desired network topology.

1 Introduction 

The mechanics of complex networks represent an insightful research domain for 
those that try to understand the behavior and the characteristics of a network by 
looking at its general (statistical) properties. Basically, the focus concerns the
organization and the interaction among multiple nodes in a dynamical system
[3]. The theory and methods of analysis can be applied in the same
fashion to existing real and abstract networks belonging to several domains,
e.g. biology, sociology, physics, computer science [32, 38]. Examples
of statistical properties of common interest are the probability that nodes have 
a certain degree (i.e. the number of neighbours connected to a given node), the 
probability that a node has links with the friends of its friends (which allows 
to understand how much the network is organized in clusters), the average 
number of second (third, etc.) neighbours (which provides insights on the size 
of the network component of a given node), the network diameter, etc. All these 
metrics reveal some features of a given network, such as its ability to disseminate 
information and/or propagate viruses, its resilience to nodes' departure, and its
connectivity.

As for computer networks, modeling peer-to-peer overlays as complex nets
makes it possible to understand the level of reliability, scalability and tolerance
to faults of these overlays. This is basically the purpose of this paper. Specifically, we
provide a framework to model self-organizing, unstructured peer-to-peer
architectures with periodical faults. Nodes of the network simply correspond to
peers, while edges represent a communication connection between two peers
[37]. (Since nodes of the modeled network represent
peers in the distributed system, hereinafter the terms node and peer will
be used as synonyms.) In general, a peer-to-peer network is characterized by
specifying: i) the system model, i.e. the environment of execution of the peers, 
together with the types of faults they are subject to; and ii) the distributed 
communication protocol, i.e. how peers connect and interact with other nodes 
in the net. 

The peer-to-peer network is unstructured, in the sense that the overlay is 
constructed based on some general desired topology that does not depend on 
the contents being disseminated through the net [18]. Rather, local choices are 
made by each peer to manage its connections. This may lead to a non-optimal 
organization of the overlay, from the view-point of the content distribution. 
However, the costs for managing such overlays are very limited. Thus, unstructured
systems may perform better in highly dynamic environments.

The system is composed of a set of peers that may fail during the evolution 
of the network. Node failures are modeled as random variables characterized 
by an average failure rate, as usual. A node failure does not cause the complete 
removal of the peer from the network. Rather, the peer loses all its links. 
Based on the protocol we define, peers react to these disconnections by actively 
creating novel links with their non-neighbours, trying to maintain a specific 
desired degree. As mentioned, the overlay is unstructured; thus, it is assumed 
that a self-organizing mechanism is employed to govern the network dynamics. 
Hence, local decisions are taken by peers to manage disconnections, without the 
intervention of a central entity [27]. The procedure related to the discovery of a
non-neighbour and the creation of a novel edge is periodically executed, based 
on another rate. 

Having defined the system model and the distributed protocol that peers
execute, we provide a mathematical analysis of the evolution of the nodes'
degree. This is accomplished by introducing an infinite set of differential
equations. Then, these equations are turned into a single differential equation by
exploiting generating functions. Its solution makes it possible to calculate the
nodes' degree probability.

The novelty of this proposal is essentially due to the dynamic behavior of
peers. Classic works on complex networks usually concentrate on node removals,
without considering any counter-mechanism that dynamically and continuously
reconfigures the network [35]. Indeed, a "passive" behavior may perfectly model
the viral propagation of diseases in human contact nets, denial of service in
computer nets, and general sudden attacks on networks that do not evolve during
the period of the attack (or rather, whose evolution proceeds at a pace
significantly slower than the attack). Conversely, this kind of approach cannot
model the typical interactions of self-organizing peer-to-peer architectures,
commonly exploited in unstructured overlay management techniques, with peers
being programmed to dynamically react to (or prevent) possible node faults. The
framework provided in this paper makes it possible to determine the degree
distribution at peers in the presence of node faults (and link creation) which
occur during the whole system evolution. At the same time, all the reasoning
related to complex networks theory can be applied.

We compare the mathematical model with results obtained from a simula- 
tive assessment that mimics the corresponding distributed protocol. We vary 
the nodes' desired degree distribution. Specifically, we study three classic (de- 
sired) network topologies: i) uniform networks where all nodes have the same 
desired degree, ii) random graphs, iii) scale-free networks. Results show that the 
two different (theoretical and simulative) approaches provide similar outcomes, 
hence confirming the correctness of the proposal. Moreover, they provide insights
into the degree that peers manage to maintain in the presence of node faults. In fact,
since the network is continuously affected by node faults, and nodes are able
to create novel links based on local (self-regulated) choices, it turns out that
peers can maintain their own desired degree only when the attachment rate
is high with respect to the failure rate. Once the degree distribution has been
calculated, given the system settings, it is possible to estimate the average number of
second (third, etc.) neighbours, as well as the average size of the component a 
peer is connected to. In particular, we estimate the diameter of the considered 
networks. 

Of course, ensuring that peers have an actual degree equal (or similar) to
their desired degree is mandatory to guarantee that the structure of the
peer-to-peer network corresponds to its desired topology. Hence, the provided
analytical tool can be exploited at peers to dynamically identify a proper
attachment rate to maintain during the distributed interactions, based on
the experienced node failure rate. Simple algorithms that adapt the
attachment rate may thus be implemented.

The remainder of the paper is organized as follows. Section 2 presents the
distributed protocol we consider. Section 3 describes the analytical modeling
of such a protocol. In Section 4, results coming from a simulation study are
outlined. These outcomes are compared with the numerical results obtained
through the presented model. Finally, Section 5 provides some concluding
remarks.

2 The Distributed Protocol 

Consider a distributed system composed of a set of peers $\Pi$. Communication
among peers occurs through an overlay network. The system is faulty, in the 
sense that nodes may fail during their interaction with other ones. When a 
node failure happens, the peer loses all its links with its neighbours. After the 
failure, the peer is instantaneously able to create novel link connections, i.e. the 
time needed by the peer to restart its local system and re-join the network is 
assumed to be negligible. 

Algorithm 1 Distributed Protocol: Attachment Process

vars: actualDegree: current degree of the node executing the protocol
      dd: desired degree of the node executing the protocol

precondition: actualDegree < dd

found = false;
while (¬found) do
    p = NonNeighbourDiscovery();
    sendLinkCreationRequest(p);
    ans = receiveAnswer();
    if (ans == "ok") then
        found = true;
        createNovelLink(p);
        actualDegree++;
    end if
end while
waitRandomTime();

In the model, we consider node faults, rather than link faults, since in an
unstructured peer-to-peer system it is more likely that a peer fails than that
a single edge of the graph permanently fails. A node may fail because of a
voluntary action taken by the user, who decides to leave the network, or when
the peer remains isolated from the rest of the network, due for instance to some
technical problems which prevent that node from communicating with its Internet
Service Provider, or when it loses its network coverage (hence losing all its
connections with the rest of the world). Conversely, while still possible, the removal
of a single link in a peer-to-peer overlay network (with both peers remaining
active) should be less frequent. Of course, TCP/UDP connections between
two hosts, representing the transport-layer implementation of a link between two
peers, may be interrupted for several reasons. However, from a networking
point of view, several techniques can be exploited, such as session-layer
protocols, which augment the reliability of an end-to-end communication
[21].
Due to the dynamic and evolving nature of the network, we enable peers 
to create novel links with non-neighbours; this is accomplished through a local,
random choice taken by the peer. Peers have a specific chosen degree and 
try to maintain it during the system evolution, in spite of nodes' faults. In 
substance, nodes select a desired degree (dd), whose value might depend on the 
specific characteristics of the node, e.g. computational and network capacities, 
role of the node in the network. When modeling the network, to characterize its 
desired topology, dd values will be assigned to nodes by utilizing some statistical 
distribution. As an example, for the sake of load balancing, peers' dds could be 
forced to assume values within a limited range (or a single value). Instead, the 
use of other desired degree distributions, such as power laws (typical of scale-free
nets), would mimic hybrid multi-level peer-to-peer networks with the presence 
of hubs/super-peers. 

Based on their dd, during the system evolution peers that have an actual 
degree lower than such a value periodically start a discovery process to find a 
novel neighbour. We assume that when a peer asks another one to establish a 
novel link in the overlay, the latter refuses it only if its actual degree is equal to 
its dd. Otherwise, it accepts the link creation. 

The distributed protocol discussed above is summarized in Algorithms 1 and 2.

Algorithm 2 Distributed Protocol: Upon Request for a Novel Link

vars: actualDegree: current degree of the node executing the protocol
      dd: desired degree of the node executing the protocol

precondition: message received for link creation

p = sendingPeer();
if (actualDegree < dd) then
    sendPositiveAnswer(p);
    createNovelLink(p);
    actualDegree++;
else
    sendNegativeAnswer(p);
end if



Basically, when the actual degree of a node is lower than dd (the precondition
in Algorithm 1), a discovery process is activated to find novel neighbours.
Algorithm 1 does not report a specific implementation of the discovery of a
non-neighbour, since several alternatives are possible, not strictly dependent on the
protocol under consideration. We simply assume that the selection of the
new neighbour is accomplished by randomly picking a peer, as done in most
unstructured peer-to-peer overlay networks [23, 30]. To find the novel
node, a distributed oracle (or some approximation of it, obtained through local
interactions) is employed, which provides the complete list of active peers. Once
a novel peer has been found, a request is sent to that peer. If a positive answer
is received, a novel link is created. Otherwise, the node looks for another peer.
Note that in the pseudo-code a random sleep has been inserted, to state that
such a procedure should be periodically executed while the node seeks to reach
an actual degree equal to its dd.

Algorithm 2 is executed upon request for a novel link from a non-neighbour.
The behavior is quite simple: if the receiving node has an actual degree lower
than its dd, it accepts the request and a novel link is created. Otherwise, it
refuses the request.
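The two algorithms can be sketched in a few lines of Python. This is only an illustrative, single-process rendering: the class name `Peer`, the method names, the list `peers` standing in for the distributed oracle, and the `max_tries` bound (the pseudo-code loops until a peer accepts) are assumptions introduced here, not part of the protocol specification.

```python
import random

class Peer:
    """One overlay node; `dd` is its desired degree (hypothetical helper class)."""

    def __init__(self, pid, dd):
        self.pid = pid
        self.dd = dd
        self.neighbours = set()          # ids of current neighbours

    @property
    def actual_degree(self):
        return len(self.neighbours)

    def on_link_request(self, other):
        """Algorithm 2: accept only while below the desired degree."""
        if self.actual_degree < self.dd:
            self.neighbours.add(other.pid)
            other.neighbours.add(self.pid)
            return True
        return False

    def attach(self, peers, rng, max_tries=100):
        """Algorithm 1: ask random non-neighbours until one accepts.

        `peers` is a list indexed by peer id and plays the role of the oracle.
        """
        if self.actual_degree >= self.dd:      # precondition not met
            return False
        for _ in range(max_tries):             # bounded, unlike the pseudo-code
            candidates = [p for p in peers
                          if p.pid != self.pid and p.pid not in self.neighbours]
            if not candidates:
                return False
            if rng.choice(candidates).on_link_request(self):
                return True
        return False

    def fail(self, peers):
        """Node fault: the peer stays in the system but loses all its links."""
        for q in self.neighbours:
            peers[q].neighbours.discard(self.pid)
        self.neighbours.clear()
```

A peer would call `attach` periodically (the random wait of Algorithm 1) for as long as its actual degree is below its dd.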

3 Modeling the System as a Complex Network 

In this section, we show that the presented system can be modeled as a complex 
network, through the use of differential equations and generating functions. 
Nodes' failures are modeled as random variables characterized by an average
rate $\phi$. Moreover, we assume that the rate of creation of a novel link is controlled
by the parameter $\alpha$. It is the difference between $\alpha$ and $\phi$ that determines how
peers react to failures. The attachment and failure rates $\alpha$, $\phi$ do not depend
on any specific characteristics of the peers (e.g. node degree). This means that
the model does not consider any form of preferential attachment, which would
privilege nodes with higher (lower) degrees [36], nor does it assume that nodes
with higher (lower) degrees, i.e. those nodes that have more (less) workload
in the communication network, are more likely to fail.

3.1 Preliminaries and Methodology

Here, a general overview is provided on the methodology employed to model the 
distributed protocol. The idea is to define the evolution equations describing 
how the system evolves in time. In practice, for each possible degree, a differen- 
tial equation is defined which characterizes the probability that a peer, having 
such a degree, may change its state. The model will be composed of an infinite 
set of simultaneous linear differential equations (one for each possible degree). 
These equations will be turned into a single differential equation by exploiting 
generating functions. 



A probability generating function is of the form

$$F(x,t) = \sum_{i \ge 0} D_i(t)\, x^i,$$

where $D_i(t)$ is the set of coefficients composing the power series (in our case,
these coefficients are the probabilities of having a certain degree $i$, at time $t$),
while $x$ is a dummy variable, employed for purely algebraic purposes. $F(x,t)$
captures all the information present in the original sequence $D_i(t)$, as each of
these probabilities can be recovered by simple differentiation:

$$D_i(t) = [x^i]F = \frac{1}{i!} \left.\frac{\partial^i F}{\partial x^i}\right|_{x=0}.$$

The notation $[x^i]F$ represents the coefficient associated with the term $x^i$ in the
power series.

In general, many properties can be obtained by evaluating some manipulation
of the generating function at $x = 1$. For instance, having probabilities as
coefficients of the power series, a check to perform is to assess whether the sum
of all coefficients in $F$ equals 1, i.e. $F(1,t) = 1$. Moreover, the average of the
coefficients composing the generating function can be measured by evaluating the
partial derivative with respect to $x$, $F_x = \frac{\partial F}{\partial x}$, at $x = 1$, i.e. $F_x(1,t) = \sum_i i D_i(t)$.

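Other useful algebraic properties, which are easy to verify and will be used in
the rest of the paper, are the following:

$$\sum_{i \ge 0} (i+1) D_{i+1} x^i = F_x, \qquad \sum_{i \ge 0} i D_i x^i = x F_x, \qquad \sum_{i \ge 0} D_{i-1} x^i = xF. \tag{1}$$

Then, rules of power series state that if $[x^i]A = a_i$ and $[x^i]B = b_i$, then

$$[x^i]\frac{A}{1-x} = \sum_{j=0}^{i} a_j, \qquad [x^i]AB = \sum_{j=0}^{i} a_j b_{i-j}. \tag{2}$$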

The use of generating functions will hence make it possible to consider a single
differential equation which comprises all the evolution equations of the model. From
its solution it will be possible to extract the elements of the power series, i.e. the
degree distribution.
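The properties recalled above can be checked numerically on a truncated power series. The following is a minimal sketch: the helper names and the sample coefficients `D` are hypothetical and chosen only for illustration.

```python
def gf_eval(coeffs, x):
    """Evaluate F(x) = sum_i coeffs[i] * x**i."""
    return sum(c * x**i for i, c in enumerate(coeffs))

def gf_derivative(coeffs):
    """Coefficients of F_x, i.e. [x^i]F_x = (i + 1) * coeffs[i + 1]."""
    return [(i + 1) * c for i, c in enumerate(coeffs[1:])]

def gf_over_one_minus_x(coeffs):
    """Coefficients of F / (1 - x): the partial sums of the coefficients of F."""
    out, running = [], 0.0
    for c in coeffs:
        running += c
        out.append(running)
    return out

D = [0.5, 0.3, 0.2]                       # hypothetical degree probabilities
total = gf_eval(D, 1.0)                   # normalisation check F(1)
mean = gf_eval(gf_derivative(D), 1.0)     # mean degree F_x(1)
cumulative = gf_over_one_minus_x(D)       # cumulative degree probabilities
```

Dividing by $1 - x$ is what later turns the exponential series into the exponential sum function.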

In the following, we will also consider the system in its steady state, i.e. in
the limit $t \to \infty$. This makes it possible to calculate the probability that a node
has a given degree in the stationary state. Moreover, it avoids the presence of
the partial derivative of the generating function with respect to the time variable
$t$, hence simplifying the mathematical analysis and the related discussion.

3.2 The Protocol in Differential Equations

Let $D_{i,j}(t) = P(deg = i \mid dd = j, \text{at time } t)$ denote the probability that a given
node at time $t$ has degree equal to $i$, knowing that its desired degree is $j$. Note
that, following the protocol, peers with an actual degree equal to their desired
degree do not accept novel links; hence, a probability higher than 0 is possible
only when $j \ge i$. In general, the evolution of the degree of a given peer can be
modeled, using $D_{i,j}(t)$, as

$$
\frac{dD_{i,j}(t)}{dt} =
\begin{cases}
\phi(i+1)D_{i+1,j}(t) + \phi\delta_{i,0} + 2\alpha D_{i-1,j}(t) - [\phi(i+1) + 2\alpha]D_{i,j}(t) & i < j \\
\phi\delta_{i,0} + 2\alpha D_{i-1,i}(t) - \phi(i+1)D_{i,i}(t) & i = j \\
0 & i > j
\end{cases}
\tag{3}
$$
In (3), a distinction is made between three cases, depending on the values of
$i$ and $j$. The case $i < j$ corresponds to the case when the node has a degree
lower than its desired degree. Hence, the first term on the right of the equation
corresponds to the probability that the considered peer has degree equal to $i + 1$
and one of the $i + 1$ neighbours fails. As a consequence, the node passes from a
degree equal to $i + 1$ to $i$. The second term considers the probability that the peer
fails, thus increasing the number of nodes in the network with degree equal to 0.
The third term accounts for the probability that the peer has degree $i - 1$ and
either it decides to create a novel connection with a non-neighbour, thus increasing
its degree by one, or another peer asks the considered one
to become its neighbour. Note that in this case we do not insert any limit on the
number of non-neighbours, assuming that the total number of nodes is high (or
tends to $\infty$); such an assumption is quite common in complex networks theory
[36]. The remaining terms have the same meaning as the preceding ones, but
account for those cases when the node has degree $i$, and itself or one of its $i$
neighbours fails (hence, its degree downgrades to 0 or $i - 1$, respectively), or
when a new edge is created between the considered peer and another one, and
the peer already has $i$ neighbours (hence, its degree upgrades to $i + 1$). The
case $i = j$ considers only those transitions discussed above that correspond to
degrees equal to $i$ or $i - 1$, avoiding the probability of having a transition from
(to) a degree equal to $i + 1 > j$ (again, not possible). As previously stated, the
case $i > j$ (i.e. an actual degree higher than the desired degree) is not possible
due to the protocol executed by peers; hence, the probability is 0. As a final
remark, in (3) it is assumed that the probability that two transitions occur
simultaneously is negligible, as usual.
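To see how these evolution equations behave, one can integrate them numerically for a single desired degree $j$. The sketch below uses a plain explicit Euler scheme; the function names, the step size `dt`, the number of steps and the initial condition (the peer starts with its desired degree) are assumptions made for illustration only.

```python
def derivative(D, alpha, phi):
    """Right-hand side of the evolution equations for D[i] = D_{i,j}, i = 0..j."""
    j = len(D) - 1
    dD = []
    for i in range(j + 1):
        inflow = phi if i == 0 else 0.0        # the peer itself fails: degree -> 0
        if i > 0:
            inflow += 2 * alpha * D[i - 1]     # a link is created from degree i - 1
        outflow = phi * (i + 1) * D[i]         # the peer or one of its i neighbours fails
        if i < j:
            inflow += phi * (i + 1) * D[i + 1] # one of i + 1 neighbours fails
            outflow += 2 * alpha * D[i]        # a link is created (only below dd)
        dD.append(inflow - outflow)
    return dD

def steady_state(alpha, phi, j, dt=0.01, steps=5000):
    """Explicit Euler integration, starting with the desired degree reached."""
    D = [0.0] * j + [1.0]
    for _ in range(steps):
        grad = derivative(D, alpha, phi)
        D = [d + dt * g for d, g in zip(D, grad)]
    return D
```

For example, `steady_state(0.5, 1.0, 1)` approaches the stationary solution of the two coupled equations for $j = 1$. The step size must be small with respect to $1/(\phi(j+1) + 2\alpha)$ for the Euler scheme to remain stable.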

As mentioned, it might be interesting to consider the system in its steady
state, assuming the existence of the limit $D_{i,j} = \lim_{t \to \infty} D_{i,j}(t)$, which implies
that the variation of the probability of having a certain degree goes to 0,
i.e. $\frac{dD_{i,j}(t)}{dt} = 0$. Equation (3) thus becomes

$$\phi(i+1)D_{i,i} = \phi\delta_{i,0} + 2\alpha D_{i-1,i} \qquad i = j \tag{4}$$

$$[\phi(i+1) + 2\alpha]D_{i,j} = \phi(i+1)D_{i+1,j} + \phi\delta_{i,0} + 2\alpha D_{i-1,j} \qquad i < j \tag{5}$$

To solve these equations using generating functions, consider for the moment
the auxiliary system of equations obtained by ignoring the limit imposed by the
desired degree. Let us hence use different coefficients $\mathcal{D}_{i,j}$ (it will be possible to
derive $D_{i,j}$, once having determined $\mathcal{D}_{i,j}$). The equations to manage are

$$[\phi(i+1) + 2\alpha]\mathcal{D}_{i,j} = \phi(i+1)\mathcal{D}_{i+1,j} + \phi\delta_{i,0} + 2\alpha\mathcal{D}_{i-1,j}. \tag{6}$$

There are two indexes associated with the coefficients $\mathcal{D}_{i,j}$, i.e. the actual and the
desired degree of a given node. Therefore, we employ a 2-variable generating
function

$$F(x,y) = \sum_{i,j \ge 0} \mathcal{D}_{i,j}\, x^i y^j,$$

where $x$ controls the actual degree of the peer, while $y$ controls the desired
degree of the node.
Now, multiply (6) by $x^i$ and $y^j$ and sum over all $i, j \ge 0$. The result is that
the infinite set of simultaneous equations is turned into a single,
novel differential equation for the generating function $F$,

$$\phi(x-1)F_x + [\phi - 2\alpha(x-1)]F = \frac{\phi}{1-y}. \tag{7}$$

Such an equation is obtained by exploiting the properties of generating functions
(1) and observing that $\sum_{i,j \ge 0} \delta_{i,0}\, x^i y^j = \frac{1}{1-y}$. As mentioned, $F_t$ is not present
since we are considering the system directly in the steady state. It is possible
to verify that a solution of this differential equation is

$$F(x,y) = \frac{\phi}{2\alpha(1-x)(1-y)} - F_0(y)\,\frac{e^{\frac{2\alpha}{\phi}x}}{1-x}, \tag{8}$$

where $F_0$ is an initial function to be determined, based on the boundary
conditions.

3.3 Degree Probability

The obtained function $F$ is an unfortunate one, since it is not defined for $x = 1$,
and we already mentioned that many properties might have been obtained by
evaluating some manipulation of $F$ at $x = 1$. However, given (8),
the elements composing the generating function can be extracted by employing
classic results of power series. In particular, we may first assume that $F_0$ can
be expanded in power series, i.e. $F_0(y) = \sum_{j \ge 0} c_j y^j$. Then, observe that

$$\frac{\phi}{2\alpha(1-x)(1-y)} = \frac{\phi}{2\alpha}\sum_{i,j \ge 0} x^i y^j,$$

and, due to the mentioned rules (2) of power series, we have

$$[x^i y^j]\, F_0(y)\frac{e^{\frac{2\alpha}{\phi}x}}{1-x} = c_j \sum_{k=0}^{i} \frac{1}{k!}\left(\frac{2\alpha}{\phi}\right)^k = c_j\, e_i\!\left(\frac{2\alpha}{\phi}\right),$$

where $e_n(r)$ is the exponential sum function $e_n(r) = \sum_{k=0}^{n} \frac{r^k}{k!}$. By combining
these results, a general formula is obtained for the elements of the auxiliary
system, which is

$$\mathcal{D}_{i,j} = [x^i y^j]F = \frac{\phi}{2\alpha} - c_j\, e_i\!\left(\frac{2\alpha}{\phi}\right). \tag{9}$$

It is now possible to calculate $D_{i,j}$ from $\mathcal{D}_{i,j}$, by determining the coefficients
$c_j$ in (9), such that $D_{i,j} = \mathcal{D}_{i,j}$ when $i < j$, and also in order to satisfy the
boundary equation (4), considering the case $i = j$. In particular, when $i = j$,
comparison of equations (6) and (4) shows that if $D_{i,i} = \mathcal{D}_{i,i}$ is true, then it
must be

$$2\alpha\mathcal{D}_{i,i} = \phi(i+1)\mathcal{D}_{i+1,i}.$$

From this last equation, the coefficients $c_i$ are determined,

$$c_i = \frac{\phi\,[\phi(i+1) - 2\alpha]}{2\alpha\left\{[\phi(i+1) - 2\alpha]\, e_i\!\left(\frac{2\alpha}{\phi}\right) + \frac{\phi}{i!}\left(\frac{2\alpha}{\phi}\right)^{i+1}\right\}}.$$

Thus,

$$D_{i,j} = \frac{\phi}{2\alpha} - c_j\, e_i\!\left(\frac{2\alpha}{\phi}\right), \qquad i \le j. \tag{10}$$

Now, $D_{i,j}$ represents the probability that a node has an actual degree equal
to $i$, knowing that its desired degree is $j$. To find the probability $D_i$ that a node
has degree $i$, it is thus sufficient to employ the formula

$$D_i = \sum_j P(deg = i \mid dd = j)\,P(dd = j) = \sum_j D_{i,j}\, P(dd = j),$$

once having specified a desired degree distribution $P(dd = j)$, $j \ge 0$, during the
design of the peer-to-peer system.
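The closed form for the conditional degree probability, i.e. $\phi/(2\alpha)$ minus $c_j$ times the exponential sum function, is straightforward to evaluate. The following is a minimal sketch: the function names and the dictionary encoding of $P(dd = j)$ are assumptions, and the use of `math.factorial` with floating-point division limits it to desired degrees up to roughly 170.

```python
from math import factorial

def exp_sum(n, r):
    """Exponential sum function e_n(r) = sum_{k=0}^{n} r^k / k!."""
    return sum(r**k / factorial(k) for k in range(n + 1))

def degree_given_dd(alpha, phi, j):
    """Return [D_{0,j}, ..., D_{j,j}] for attachment rate alpha, failure rate phi."""
    r = 2 * alpha / phi
    gap = phi * (j + 1) - 2 * alpha
    c_j = phi * gap / (2 * alpha * (gap * exp_sum(j, r)
                                    + phi * r**(j + 1) / factorial(j)))
    return [phi / (2 * alpha) - c_j * exp_sum(i, r) for i in range(j + 1)]

def degree_distribution(alpha, phi, dd_probs):
    """Marginal D_i = sum_j D_{i,j} P(dd = j); dd_probs maps j -> P(dd = j)."""
    D = [0.0] * (max(dd_probs) + 1)
    for j, p in dd_probs.items():
        for i, d in enumerate(degree_given_dd(alpha, phi, j)):
            D[i] += p * d
    return D
```

For a fixed desired degree $n$, `degree_given_dd(alpha, phi, n)` is already the full degree distribution; for random or scale-free desired topologies, `dd_probs` would hold the chosen desired degree distribution truncated at some maximum degree.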



3.4 Nodes at Distance m, Network Diameter 

Having obtained a degree probability distribution for the considered network,
interesting measures to calculate are the mean number of first and second
neighbours and, generally, the number of neighbours at distance $m$ from a given
peer. These metrics are of great importance to understand how,
and how fast, information can be disseminated in a peer-to-peer
network.

Of course, having the degree probability distribution, the average number
of first neighbours $z_1$ of a given peer, i.e. the mean degree, can be calculated
as $z_1 = \langle k \rangle = \sum_k k D_k$. Then, an important result is that if the network
exhibits a small clustering, the probability that one of the second neighbours of
a peer is also a first neighbour of it is negligible in (very) large networks [35].
This makes it easy to calculate the mean number of second neighbours as $z_2 =
\sum_k (k-1) k D_k = \langle k^2 \rangle - \langle k \rangle$. In general, the number of neighbours at distance
$m$ can be estimated as $z_m = (z_2/z_1)^{m-1} z_1$. Moreover, when $z_2 > z_1$ the net
exhibits a giant component which, roughly speaking, connects the majority of
nodes in the network (the reader may refer to [35] for a complete discussion).

A method to construct a network with small clustering, regardless of the
desired degree distribution, is as follows. For each node $i$ in the network, assign
its desired degree $dd_i$, following a desired degree distribution. Then add to it $dd_i$
stubs, representing the ends of the links it would like to maintain. Finally, create
links by randomly connecting stubs of different nodes. This is the approach
we adopt to create and simulate networks with different desired topologies (as
discussed in the next section). Using these networks, it is hence easy to calculate
$z_m$ values. The reader might argue that these nets do not represent "real"
existing peer-to-peer systems. Indeed, one might think of several examples of
peer-to-peer architectures which do have clusters. In such cases, the obtained
results represent upper bounds of the real values of $z_m$.
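The stub-based construction described above can be sketched as follows. The function name is hypothetical, and discarding self-loops and duplicate links (so that a few nodes may end up slightly below their desired degree) is a simplifying assumption of this sketch, not necessarily the choice made in the simulator.

```python
import random

def stub_matching(desired_degrees, rng=None):
    """Pair stubs uniformly at random; self-loops and duplicate links are dropped."""
    rng = rng or random.Random()
    # One stub per desired link end: node i appears desired_degrees[i] times.
    stubs = [node for node, dd in enumerate(desired_degrees) for _ in range(dd)]
    rng.shuffle(stubs)
    adjacency = [set() for _ in desired_degrees]
    # Consecutive stubs in the shuffled list are joined; an odd stub is left over.
    for a, b in zip(stubs[0::2], stubs[1::2]):
        if a != b and b not in adjacency[a]:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency
```

For instance, `stub_matching([3] * 1000)` would build a starting overlay in which (almost) every node has three links.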

In any case, when $z_2/z_1 \gg 1$, there is an average distance $l$ representing the
number of hops needed to reach a node, starting from another one [35]. Put in
other words, the number of nodes reachable within $l$ hops is almost the total
number of nodes in the network $|\Pi|$; hence we have

$$|\Pi| \approx z_l = \left(\frac{z_2}{z_1}\right)^{l-1} z_1 \quad\Longrightarrow\quad l = \frac{\log(|\Pi|/z_1)}{\log(z_2/z_1)} + 1. \tag{11}$$

In [35] it is argued that, based on empirical results, estimations obtained using
this last formula are close to correct measurements for several real networks.
Hence, we will use (11) in Section 4 to estimate the diameter of
our considered peer-to-peer overlays.
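Given a degree distribution, the quantities of this section reduce to a few sums. The sketch below assumes the distribution is passed as a list `D` with `D[k]` the probability of degree `k`; the function names are hypothetical.

```python
from math import log

def neighbour_stats(D):
    """Return (z1, z2): mean numbers of first and second neighbours."""
    z1 = sum(k * p for k, p in enumerate(D))
    z2 = sum(k * (k - 1) * p for k, p in enumerate(D))
    return z1, z2

def neighbours_at_distance(D, m):
    """Estimate z_m = (z2 / z1)^(m - 1) * z1."""
    z1, z2 = neighbour_stats(D)
    return (z2 / z1) ** (m - 1) * z1

def average_distance(D, n_nodes):
    """Estimate l = log(N / z1) / log(z2 / z1) + 1 (meaningful when z2/z1 >> 1)."""
    z1, z2 = neighbour_stats(D)
    if z2 <= z1:
        raise ValueError("no giant component expected (z2 <= z1)")
    return log(n_nodes / z1) / log(z2 / z1) + 1
```

As a worked case, a network of 1000 nodes that all have degree 10 gives $z_1 = 10$, $z_2 = 90$ and $l = \log(100)/\log(9) + 1$, i.e. slightly above 3 hops.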



4 Experimental Assessment 

This section presents an assessment performed to validate the model discussed
in the previous section and to evaluate the ability of the outlined peer-to-peer
system to cope with node faults. A comparison is performed between the analytical
model and results obtained through a simulation of the distributed protocol. As
shown in the remainder of the section, the two approaches provide very similar
outcomes. The employed approaches are very different, the former being purely
analytic while the latter is a simulator that mimics the distributed protocol
executed by a number of peers. Hence, the similarity of the obtained results
confirms that the final equation of the mathematical model can be
employed to characterize the fault-tolerance, and thus the reliability, of a system
having a defined desired topology.

As to the desired degree distribution, we consider three different distributions
and vary their related parameters. Namely, the three considered scenarios are:
i) a fixed desired degree distribution, which would produce a uniform graph
with all nodes having the same number of links; ii) a classic random graph
where nodes are connected with others with a certain probability [35]; iii) a
power law distribution, which would create a scale-free network [7, 3].



4.1 On the Simulator 

A discrete-event simulator has been built to model the defined distributed
protocol. It has been implemented in C, by exploiting the GNU Scientific
Library (GSL), a library that provides implementations of several mathematical
routines for numerical and statistical analysis, such as pseudo-random number
generators [1]. The simulator can generate a varying number
of nodes. During the initialization phase, a random network is created based on
the chosen desired degree distribution. Different techniques can be employed to
create such a random network [35, 8]. As already discussed in Section 3.4,
in this case, once a specific desired degree has been assigned to each node, based
on the specific desired distribution, a random mapping is made so that links
are created until each node has reached its own desired degree. Hence, at the
beginning of the evolution nodes already have the number of links they would
like to maintain (this generally affects only the transient part of the simulation).

The simulator creates a network with a fixed number of nodes. This eases 
the measurement of the degree nodes have in time, without the need to consider 
novel nodes that join the network during the execution of the protocol. Hence, 
once a peer fails, it is not removed from the network; rather, all its links are 
removed. From that moment, the node will try to create novel links with novel 
peers, searching to reach its desired degree. 

After the network initialization phase, the evolution of the network starts.
Nodes' failures and the discovery of other nodes for the creation of novel links
have been implemented as Poisson processes, whose rates are regulated by the
parameters $\phi$ and $\alpha$, respectively. The shown results represent the status of
the system after a specified simulation time. The length of the simulation was
$10^4$ simulation steps. Unless otherwise stated, the number of nodes was set
equal to 1000. For each specific configuration, we ran 30 different experiments.
The outcomes shown correspond to average results.
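The evolution phase can be illustrated with a compact event-driven sketch in Python (the actual simulator is written in C with GSL; this is not its code). It assumes that every peer fails at rate $\phi$ and attempts an attachment at rate $\alpha$, draws exponential inter-event times, starts from an empty overlay rather than from the initialized one, and bounds the number of discovery attempts with a hypothetical `max_tries` parameter.

```python
import random

def simulate(dds, alpha, phi, t_end, seed=0, max_tries=50):
    """Return the final degree of each node; dds[i] is the desired degree of node i."""
    rng = random.Random(seed)
    n = len(dds)
    adj = [set() for _ in range(n)]
    t = 0.0
    total_rate = n * (alpha + phi)             # superposition of all Poisson processes
    while True:
        t += rng.expovariate(total_rate)       # waiting time to the next event
        if t > t_end:
            break
        u = rng.randrange(n)                   # node hit by the event
        if rng.random() < phi / (alpha + phi):
            # Failure: u stays in the system but loses all its links.
            for v in adj[u]:
                adj[v].discard(u)
            adj[u].clear()
        elif len(adj[u]) < dds[u]:
            # Attachment: probe random peers until one accepts (bounded).
            for _ in range(max_tries):
                v = rng.randrange(n)
                if v != u and v not in adj[u] and len(adj[v]) < dds[v]:
                    adj[u].add(v)
                    adj[v].add(u)
                    break
    return [len(a) for a in adj]
```

A histogram of the returned degrees, averaged over several seeds, is the kind of empirical distribution that can be set against the analytical degree probabilities.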

4.2 Degree Distribution of Fixed Desired Degree Networks 

The first type of generated networks was based on a fixed dd, i.e. peers have the
same value of desired degree $dd = n$. Forcing peers to have the same desired
degree makes it possible to model those classic scenarios in peer-to-peer environments
where the software running on peers is configured to have a given number of
links in the overlay, i.e. dd. This is quite common in real peer-to-peer systems
and it is usually done for load balancing purposes [43].

The model restricts the event space to the case when all nodes' desired degree
is constant, $dd = n$; an obvious consequence is that $D_{i,j} = 0$, $j \ne n$. Moreover,
due to the distributed protocol, $D_{i,j} = 0$, $i > j$. Hence, the sum of all the values
of $D_{i,n}$ when $i$ is varied reduces to $\sum_{i \le n} D_{i,n} = \sum_{i \le n} P(deg = i \mid dd = n) = 1$.
In this case, we can hence simply consider in the model the values of $D_{i,n} =
P(deg = i \mid dd = n)$, for a fixed $n$.

Figures 1 and 2 show the probability that a given node has a certain degree, based
on the parameters $\alpha$, $\phi$. All figures report both the node degree probability itself,
as well as the cumulative probability, i.e. the probability that a node has a degree
less than or equal to the considered value. For these two metrics, two measurements
are reported, obtained by using Equation (10) and through simulation. We
concentrate on two different types of networks, corresponding to two desired
degree values, i.e. $dd = 30$ (Figure 1) and $dd = 100$ (Figure 2). As shown below,
the two networks have similar behaviors for the selected values of the rates $\alpha$, $\phi$;
the same holds for other similar dds.

By looking at the figures, a first consideration is that similar results are obtained
using simulation and the mathematical model. Moreover, very different outcomes
are measured, depending on the $\alpha$, $\phi$ values. In particular, when the value of the
failure rate $\phi$ is higher than the attachment rate $\alpha$, in the steady state only low
degree values have a probability significantly higher than 0. This can be
appreciated by looking at the first chart of Figures 1 and 2, where $\alpha = 0.1$, $\phi = 0.2$. In
both cases, degree values that take some non-negligible probabilities are those
that range from 0 to 6. The cumulative probabilities, in the considered
scenarios, reach values near to 1 at very low degree values. This basically means that
in the steady state almost all peers tend to have experienced some failures and
they do not succeed in maintaining the desired network topology. As mentioned
before, our assumption is that peers instantaneously come back into the system
and try to create some novel links, yet without being able to gain a noticeable
degree. This is due to the low value of $\alpha$. Moreover, since non-negligible
values are well below the considered desired degrees, the charts
reported in Figures 1 and 2 are almost equal (although they are indeed slightly
different), since the dd value does not act as a bound for the link creation. These
first results demonstrate that peers must be able to react to changing
conditions of the system and self-organize. In fact, $\alpha$ can be interpreted as a
basic parameter that regulates how active a peer is in the network.

Figure 1: Degree probability and cumulative degree probability; results obtained
through simulation (Sim) and the mathematical modeling (Th); $\alpha = 0.1$, $\phi = 0.2$,
$dd = 30$

Things start to change when a takes values higher than $\phi$. These settings
mimic situations in which peers actively create links at a rate higher
than the failure rate. The second charts in Figures 1 and 2 show results when
a = 0.8, while keeping $\phi$ equal to 0.1, lower than a. In this case, non-negligible
degree probabilities may be observed for degree values higher than those
obtained before, yet still without reaching the desired degree (this is more
evident when dd = 100). In this particular scenario, results
from the simulation and the mathematical modeling are not perfectly identical,
and slight differences can be appreciated: simulations show
that nodes tend to have a lower degree than that predicted by the mathematical
modeling. In both cases, the obtained degrees remain well below the nodes'
desired degree.

Results completely change when $\phi$ is set well below the value of a. In
these scenarios, in the steady state the probability that a node has a certain
degree is mostly uniform for all degrees up to the nodes'
desired degree. This can be appreciated by looking at the two final charts of
the considered figures. In particular, with the setting $a = 0.5, \phi = 0.01$,
dd = 30, it is quite probable that in the steady state nodes have their
desired degree, while with dd = 100 the probabilities of degree values lower than
dd are almost uniformly distributed. When $\phi = 0.001$, instead, the probability
of having a degree equal to dd in the steady state reaches a high value also if
dd = 100. In substance, under this setting, the desired network topology is
maintained in the steady state.

Figure 2: Degree probability and cumulative degree probability; results obtained
through simulation (Sim) and the mathematical modeling (Th);
$a = 0.1, \phi = 0.2$, dd = 100
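
The three regimes discussed above can be explored with a toy discrete-time
simulation. The sketch below is only an approximation of the distributed
protocol, built on stated assumptions: at each step a node fails with
probability phi (losing all its links and rejoining immediately with degree 0);
otherwise, if it is below its desired degree, it tries with probability a to
link to a randomly chosen peer that is also below its desired degree. The
update order, the one-link-per-step rule and all names are assumptions made
for illustration.

```python
import random


def simulate(n_nodes, dd, a, phi, steps, seed=0):
    """Toy overlay simulation; returns the list of node degrees after `steps` steps."""
    rng = random.Random(seed)
    neighbours = [set() for _ in range(n_nodes)]
    for _ in range(steps):
        for u in range(n_nodes):
            if rng.random() < phi:
                # Failure (assumed model): u drops all links and rejoins with degree 0
                for v in neighbours[u]:
                    neighbours[v].discard(u)
                neighbours[u].clear()
            elif len(neighbours[u]) < dd and rng.random() < a:
                # Attachment (assumed model): try one random peer that still has room
                v = rng.randrange(n_nodes)
                if v != u and v not in neighbours[u] and len(neighbours[v]) < dd:
                    neighbours[u].add(v)
                    neighbours[v].add(u)
    return [len(links) for links in neighbours]


if __name__ == "__main__":
    for a, phi in [(0.1, 0.2), (0.8, 0.1), (0.5, 0.01)]:
        degrees = simulate(n_nodes=200, dd=30, a=a, phi=phi, steps=500)
        print(a, phi, sum(degrees) / len(degrees))
```

Under these assumptions, a failure rate above the attachment rate typically
keeps the average degree close to zero, while lowering phi lets nodes approach
dd. The numbers it prints are not expected to match the figures.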

Figure 3 shows the estimated diameter of the networks obtained when running
the distributed protocol with an average attachment rate a = 0.5, while
varying the value of $\phi$, assuming a network composed of 1000 nodes. The chart
also reports the average number of first neighbours $z_1 = \langle k \rangle$ and of
second neighbours $z_2$ (measured through the analytical model). It can be
observed that the number of second neighbours is higher than the number of first
neighbours when $a > \phi$. Hence, when employed on large networks, the protocol
allows the creation of a giant component. Note that when $\phi$ has low values, the
diameter is very limited and nodes succeed in maintaining a very high degree,
since the network is composed of only 1000 nodes while the desired degree of each
peer is equal to 100. This confirms that a proper attachment rate may guarantee
that contents can be rapidly disseminated through the overlay, whatever the
communication strategy employed on top of it. Then, as the failure rate grows,
the network diameter grows as well. It is however worth noticing
that as $\phi$ grows, the ratio $z_2/z_1$ decreases. Thus, the estimation of the
network diameter might be less reliable.

Figure 3: Diameter, average number of first and second neighbours of fixed
desired degree networks, when varying $\phi$, calculated using Equation (11)
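
The quantities reported in Figure 3 can be sketched numerically. The diameter
equation used in the paper is not reproduced in this section, so the code below
assumes the standard generating-function estimate for uncorrelated random
networks, $\ell = \ln(|\Pi|/z_1)/\ln(z_2/z_1) + 1$ with
$z_2 = \langle k^2 \rangle - \langle k \rangle$; it may differ from the
expression actually employed. Function names are illustrative.

```python
import math


def neighbour_averages(probability):
    """Return (z1, z2) from a degree distribution indexed by degree."""
    z1 = sum(k * p for k, p in enumerate(probability))
    second_moment = sum(k * k * p for k, p in enumerate(probability))
    # Assumed relation for uncorrelated networks: z2 = <k^2> - <k>
    return z1, second_moment - z1


def estimated_path_length(n_nodes, z1, z2):
    """Assumed estimate ln(N/z1)/ln(z2/z1) + 1; infinite without a giant component."""
    if z1 <= 0 or z2 <= z1:
        return math.inf
    return math.log(n_nodes / z1) / math.log(z2 / z1) + 1


if __name__ == "__main__":
    # Hypothetical distribution: every node has degree 3
    z1, z2 = neighbour_averages([0.0, 0.0, 0.0, 1.0])
    print(z1, z2, estimated_path_length(1000, z1, z2))
```

The denominator $\ln(z_2/z_1)$ approaches zero as $z_2/z_1$ approaches 1, which
is one way to see why the estimate becomes less reliable as the ratio decreases.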



4.3 Degree Distribution of Random Graphs 

Here, we consider random graphs to model the desired degree distribution of
networks. This is a generalization of the approach described above, with peers
all having the same probability to attach to other peers. In substance, when a
random graph is generated, a link between each pair of peers is created with a
certain probability p. The average degree is thus $\langle k \rangle = p|\Pi|$. It
is well known that when the number of peers $|\Pi|$ is large, the node degrees of
random graphs may be well characterized using a Poisson distribution
$\frac{\langle k \rangle^k e^{-\langle k \rangle}}{k!}$. Several works
employ this construction tool for generating random graphs [35].
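
The construction just described can be sketched in a few lines of Python. This
is a minimal illustration, assuming each unordered pair of peers is linked
independently with probability p; the function name, the seed and the small
network size are assumptions chosen to keep the example fast.

```python
import random
from itertools import combinations


def random_graph_degrees(n_nodes, p, seed=0):
    """Link each pair of peers with probability p and return the node degrees."""
    rng = random.Random(seed)
    degrees = [0] * n_nodes
    for u, v in combinations(range(n_nodes), 2):
        if rng.random() < p:
            degrees[u] += 1
            degrees[v] += 1
    return degrees


if __name__ == "__main__":
    degrees = random_graph_degrees(n_nodes=200, p=0.2)
    # The mean degree is close to p * (n_nodes - 1), i.e. about p * n_nodes
    print(sum(degrees) / len(degrees))
```

In the setting of this section, the degrees generated this way would play the
role of the desired degrees dd assigned to the peers.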

Figure 4 shows the degree distribution obtained through the analytical model (and
simulation) in the steady state (after the mentioned number of simulation
steps), when the desired degree distribution models a random graph with
a probability p = 0.2 and a number of nodes $|\Pi| = 1000$. Figure 5, instead,
reports results when p = 0.8. As shown in both figures, when parameters are
set as $a = 0.1, \phi = 0.01$, a non-negligible probability is obtained only for
degrees lower than 30, nodes being unable to reach their average desired degree.
Similar outcomes are measured when $\phi$ is decreased down to 0.005; in this case,
non-negligible values are obtained for degrees up to 50. Hence, in these cases the
desired topology is lost in the steady state.

The two considered types of random graphs behave differently when the
setting is $a = 0.5, \phi = 0.005$ (third chart of Figures 4 and 5). In fact, as
shown in Figure 4, with p = 0.2, in the steady state peers have a non-negligible
probability of reaching degrees near the average degree $\langle k \rangle = 200$.
Conversely, in the latter setting ($\langle k \rangle = 800$, Figure 5) the chosen
value of a does not allow peers to maintain their desired degree. Similar
considerations can be made for the last considered setting,
$a = 0.8, \phi = 0.005$. In this case, when p = 0.2 a peak
is obtained in the degree probability at the average value 200. Hence, the
network topology is maintained for p = 0.2, but not for p = 0.8. These results
once again confirm that the value of a must be properly tuned based on the
average desired degree of the nodes and the failure rate.

Figure 4: Degree probability varying a, $\phi$; results obtained through
simulation (Sim) and the mathematical modeling (Th); Random Graph model,
p = 0.2, $|\Pi| = 1000$, $\langle k \rangle = 200$

Figure 5: Degree probability varying a, $\phi$; results obtained through
simulation (Sim) and the mathematical modeling (Th); Random Graph model,
p = 0.8, $|\Pi| = 1000$, $\langle k \rangle = 800$
(Panel titles recovered from Figure 5: RG, $\langle k \rangle = 800$,
$a = 0.1, \phi = 0.01$; RG, $\langle k \rangle = 800$, $a = 0.1, \phi = 0.005$)

Figure 6: Diameter and average number of first neighbours of random graphs,
when varying $\phi$, calculated using Equation (11)



Figure 6 shows the estimated diameter (and the average number of first and
second neighbours $z_1$, $z_2$) of the considered random graphs, obtained when
a = 0.5, while varying $\phi$, again assuming a network composed of 1000 nodes.
Considerations similar to those made for uniform (fixed desired degree) graphs
apply. That is, the diameter grows with $\phi$, hence confirming that a proper
attachment rate must be employed to cope with failures and guarantee that
contents can be rapidly disseminated through the overlay, whatever the
communication strategy employed on top of it.

4.4 Degree Distribution of Scale Free Networks 

Scale-free networks have gained a lot of interest in recent years, since it has
been empirically noticed that power law degree distributions $D_k \sim k^{-a}$ are
quite good at modeling several types of real networks [7J [T31 HH [Ml HI HS1 123].
These networks are often referred to as scale-free networks [351 d]. They are
characterized by the presence of hubs, i.e. nodes with degrees higher than the
average, that have an important impact on the connectivity of the net. Several
works assert that scale-free networks are quite resilient to random node faults,
due to the presence of hubs IH[3S|. Indeed, the majority of nodes have a small
degree; thus, it is more likely that these nodes will fail, while the probability
that all hubs are eliminated is almost negligible.

The interest in scale-free networks in this work relates to the fact that
several peer-to-peer systems are indeed scale-free networks. Gnutella is a main
example. Moreover, other peer-to-peer architectures exploit super-peers,
which strongly resemble the hubs of scale-free networks [21 [551 1321 121].

To build scale-free networks, our simulator implements a construction method
proposed in [3]. The interesting aspect of this algorithm is that
it differs from other proposals, which build networks with a power law
distribution by continuously adding new nodes and edges, hence producing networks
that grow in time [5]. Conversely, the method in [3] employs a network of
fixed size, characterized by two parameters a, b. Given a, b, a network is built
whose number of nodes depends on these two parameters. More specifically, the
number of nodes y which have a degree x is $\lfloor e^a / x^b \rfloor$. Thus, the
total number of nodes of the generated network is

$$|\Pi| = \sum_{x=1}^{\lfloor e^{a/b} \rfloor} \left\lfloor \frac{e^a}{x^b} \right\rfloor,$$

$\lfloor e^{a/b} \rfloor$ being the maximum possible degree of the network, since
it must be that $0 \le \log y = a - b \log x$. Once the number of nodes and their
degrees have been determined, edges are randomly created among nodes until they
reach their desired degrees. We recall that, for each node in the network, this
initial degree is set as the desired degree dd of the node.

Figure 7: Degree distribution of some scale-free networks using the construction
method proposed in [3]
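
The first step of this construction, i.e. deciding how many nodes have each
degree, can be sketched as follows. The code applies a floor to each degree
class, following the relation $\log y = a - b \log x$; how a given simulator
rounds these counts is an assumption here, so totals computed this way need
not coincide with the network sizes reported for the experiments. Function
names are illustrative.

```python
import math


def degree_counts(a, b):
    """Map each degree x to the number of nodes y = floor(e^a / x^b)."""
    max_degree = math.floor(math.exp(a / b))  # largest x with log y >= 0
    return {x: math.floor(math.exp(a) / x ** b) for x in range(1, max_degree + 1)}


def desired_degrees(a, b):
    """Expand the counts into one desired degree dd per node."""
    degrees = []
    for x, y in degree_counts(a, b).items():
        degrees.extend([x] * y)
    return degrees


if __name__ == "__main__":
    counts = degree_counts(3, 0.5)
    print(max(counts), counts[1], len(desired_degrees(3, 0.5)))
```

The random creation of edges until nodes reach their desired degrees is left
out of the sketch.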

Figure 7 shows some examples of networks built with our simulator,
implementing the construction method proposed in [3]. In particular, the chart
reports, for three different settings of a, b, the number of nodes which have a
given degree, in a log-log scale. These distributions
are almost linear in a log-log scale, hence confirming that they all follow a
power law.
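
Linearity in a log-log scale can also be checked numerically. The sketch below
fits a least-squares line to the points (log x, log y); for an exact power law
$y = e^a / x^b$ the slope is $-b$ and the intercept is $a$. It is an
illustrative helper with assumed names, not the procedure used to produce the
figures, and a good fit alone is only a rough indication of a power law.

```python
import math


def loglog_fit(counts):
    """Least-squares line through (log x, log y); returns (slope, intercept)."""
    points = [(math.log(x), math.log(y)) for x, y in counts.items() if x > 0 and y > 0]
    n = len(points)
    mean_x = sum(px for px, _ in points) / n
    mean_y = sum(py for _, py in points) / n
    sxx = sum((px - mean_x) ** 2 for px, _ in points)
    sxy = sum((px - mean_x) * (py - mean_y) for px, py in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    # Hypothetical exact power law with a = 3, b = 0.5
    exact = {x: math.exp(3) / x ** 0.5 for x in range(1, 100)}
    print(loglog_fit(exact))
```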

Figures 8 to 12 show the resulting degree distributions obtained through
the analytical model and through simulation, when employed over scale-free
networks. For each setting, we report the degree distribution both in a linear
scale (with the cumulative probability) and in a log-log scale. The latter type of
chart makes it easy to understand whether in the steady state the network
maintains scale-free properties (i.e. a power law degree distribution)
when running the distributed protocol. Five different networks
are considered, obtained by employing the following pairs of construction
parameters: a = 3, b = 0.5 (forming a scale-free network with a number of nodes
$|\Pi| = 777$, Figure 8), a = 4.5, b = 0.8 ($|\Pi| = 876$, Figure 9), a = 5, b = 0.9
($|\Pi| = 1079$, Figure 10), a = 3.2, b = 0.5 ($|\Pi| = 1167$, Figure 11),
a = 3.2, b = 0.45 ($|\Pi| = 2196$, Figure 12). For these networks, the values of
the rates a, $\phi$ were varied.

Results show that scale-free properties are indeed maintained, in the steady
state, when high attachment rates are selected (see the last two scenarios in the
various figures, with $\phi = 0.005$, while a = 0.5, 0.8, respectively). Conversely,
the values of a reported in the first two scenarios of each figure (a = 0.1,
$\phi$ = 0.01, 0.005) demonstrate that when the attachment rate is not high enough
to repair failures, the typical topology of a scale-free network is lost. In fact,
the degree distribution in the log-log scale is not linear. These results are
common to all the considered networks.

The reliability of scale-free nets was already demonstrated in other works
[T3l l36l SJ [17]. However, they usually considered attacks while keeping the
network almost static, without the possibility to react to these node/link
removals. (The main reason is that these models are often employed for studying,
for instance, the spread of viruses or general percolation properties in a net.)
Our assessment demonstrates that the simple proposed distributed protocol
enables the maintenance of scale-free topologies also when nodes are subject to
periodic failures. Once the desired topology of the network has been specified
and each node has its own assigned degree, it suffices to employ an adequate
attachment rate to randomly select new neighbours. As already mentioned,
when nodes are randomly selected to fail, there is a low probability that a major
portion of the hubs is removed from the net (since there are few hubs
in the network, with respect to other nodes) [351 [35]. Rather, it is more likely
that peers which fail are non-hubs with low degrees. Under these circumstances,
hubs that lose some neighbours have time to react to these failures by finding
new nodes to link with. This allows the network to maintain a scale-free topology.

Finally, Figure 13 reports the estimated diameter (together with the average
number of first and second neighbours $z_1$, $z_2$) of scale-free networks built
with construction parameters a = 3.2, b = 0.5, $|\Pi| = 1167$, obtained with an
attachment rate a = 0.5 while varying $\phi$. Also in
this case the diameter of the network grows with $\phi$. It is worth noting that, as
discussed, in this case the desired topology of these networks is different from
that considered for random graphs, the former following
a power law distribution and the latter a Poisson distribution. Our
results show that, with these settings, the average number of first neighbours
$z_1$ is (slightly) lower in scale-free networks (even if the number of nodes in the
considered network is a bit higher than the 1000 nodes of random graphs).
It is interesting to observe that theoretical results on scale-free nets revealed
that, depending on the exponent of the power law characterizing the scale-free
net, the network diameter ranges from $\log |\Pi| / \log \log |\Pi|$ down to
$\log \log |\Pi|$ [321 rT5] [Ml IS]. In this case, it is worth noticing that the
network diameter of the resulting overlay increases with $\phi$, thus confirming
that if the attachment rate at a peer is not sufficient, the overlay loses the
characteristics of the desired topology.
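
To give a feeling for the two scalings mentioned above, the following sketch
evaluates them for a network of 1167 nodes. These are asymptotic scalings that
hold up to constant factors, so the values are orders of magnitude rather than
predicted diameters; the use of natural logarithms and the function name are
assumptions.

```python
import math


def scale_free_diameter_scalings(n_nodes):
    """Return (log N / log log N, log log N) using natural logarithms."""
    log_n = math.log(n_nodes)
    return log_n / math.log(log_n), math.log(log_n)


if __name__ == "__main__":
    upper, lower = scale_free_diameter_scalings(1167)
    print(upper, lower)
```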

Figure 8: Degree probability and cumulative degree probability varying a, $\phi$
on the left side; degree probability in log scale on the right side; results
obtained through simulation (Sim) and the mathematical modeling (Th); Scale
Free networks a = 3, b = 0.5, $|\Pi| = 777$

Figure 9: Degree probability and cumulative degree probability varying a, $\phi$
on the left side; degree probability in log scale on the right side; results
obtained through simulation (Sim) and the mathematical modeling (Th); Scale
Free networks a = 4.5, b = 0.8, $|\Pi| = 876$

Figure 10: Degree probability and cumulative degree probability varying a, $\phi$
on the left side; degree probability in log scale on the right side; results
obtained through simulation (Sim) and the mathematical modeling (Th); Scale
Free networks a = 5, b = 0.9, $|\Pi| = 1079$

Figure 11: Degree probability and cumulative degree probability varying a, $\phi$
on the left side; degree probability in log scale on the right side; results
obtained through simulation (Sim) and the mathematical modeling (Th); Scale
Free networks a = 3.2, b = 0.5, $|\Pi| = 1167$

Figure 12: Degree probability and cumulative degree probability varying a, $\phi$
on the left side; degree probability in log scale on the right side; results
obtained through simulation (Sim) and the mathematical modeling (Th); Scale
Free networks a = 3.2, b = 0.45, $|\Pi| = 2196$

Figure 13: Diameter and average number of first neighbours of scale-free
networks, when varying $\phi$, calculated using Equation

5 Conclusions

This paper presented a mathematical model of unstructured, self-organizing
overlay networks in faulty peer-to-peer systems. A distributed protocol has
been considered, where nodes try to maintain a desired degree, coping with node
failures. An analysis of the protocol has been provided, and numerical results
coming from the obtained mathematical tool have been compared with those
obtained through simulation. In essence, the two approaches provide
the same outcomes. Different types of network topologies have been considered,
i.e. networks with nodes having the same desired degree, random graphs and
scale-free networks.

Results demonstrate that in the presence of a non-negligible failure rate, peers
need a high attachment rate to cope with node faults. Otherwise, they are not
able to maintain their desired degree. This is important also to control the
topology of the evolving network. Hence, a final remark is that the mathematical
tool provided in this paper can be exploited in practice to dynamically adapt the
peers' attachment rate, based on their desired degree and on the failure rate
they are experiencing, so as to preserve the desired topology of the network.

The provided model can be extended in several ways. In this model, peers 
were treated uniformly, all having the same failure and attachment rates. A 
possibility is to replace the a, $\phi$ parameters with functions that may depend
on several factors such as, for instance, the gap between the actual and the
desired degree, the actual degree itself, etc. When applied to the attachment
rate, these functions would implement some form of preferential attachment. When
applied to the failure rate, forms of targeted attacks may be modeled. Moreover,
the random selection of new neighbours could be replaced with mechanisms
that employ a local search, e.g. by limiting the peers' selection to 2nd or 3rd
neighbours.